Pseudocode is a draft outline of a program written in plain English (or whatever language you prefer to write in). We use pseudocode to discuss what a program does and to identify its key elements. Starting with pseudocode helps you get your logic down quickly without worrying about the exact details or syntax of the programming language.
In [1]:
# This sequence is the first 100 nucleotides of the Influenza H1N1 Virus segment 8
flu_ns1_seq = 'GTGACAAAGACATAATGGATCCAAACACTGTGTCAAGCTTTCAGGTAGATTGCTTTCTTTGGCATGTCCGCAAACGAGTTGCAGACCAAGAACTAGGTGA'
Pseudocode:
NOTE: Please get into the good habit of commenting your code to describe what you are doing or are about to do. There must be at least one comment in your code.
In [3]:
from __future__ import division
# Write your code here (if you wish)
flu_ns1_seq_upper = flu_ns1_seq.upper()
# Count the number of "C"s in the above sequence
c_count = flu_ns1_seq_upper.count('C')
# Count the number of "G"s in the above sequence
g_count = flu_ns1_seq_upper.count('G')
# Add "C" and "G" counts together
g_c_count = c_count + g_count
# Count the total number of nucleotides in the sequence
sequence_length = len(flu_ns1_seq_upper)
# Divide the total number of "C" and "G" nucleotides by the total number of nucleotides,
# then multiply by 100 to express the result as a percentage
gc_percentage = g_c_count / sequence_length * 100
# Print the percentage
print(gc_percentage)
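The steps above can also be wrapped in a reusable function. The following is a minimal sketch, not part of the original exercise: the function name `calculate_gc_percentage` and the choice to raise an error on an empty sequence are assumptions made for illustration. It multiplies by `100.0` so that the division is a floating-point division even without `from __future__ import division`.

```python
def calculate_gc_percentage(sequence):
    """Return the GC content of a nucleotide sequence as a percentage (0-100).

    Hypothetical helper for illustration; the name and the empty-sequence
    behaviour are assumptions, not part of the original exercise.
    """
    # Normalise case so that lowercase and uppercase bases are counted alike
    sequence = sequence.upper()
    # Guard against dividing by zero when no nucleotides are given
    if not sequence:
        raise ValueError("sequence must contain at least one nucleotide")
    # Count the "G"s and "C"s and add them together
    gc_count = sequence.count('G') + sequence.count('C')
    # Using 100.0 forces floating-point division on any Python version
    return 100.0 * gc_count / len(sequence)


# In the sequence 'ATGC', two of the four bases are G or C
print(calculate_gc_percentage('ATGC'))
```

Calling `calculate_gc_percentage(flu_ns1_seq)` in the notebook should give the same value as the step-by-step version, provided that version also multiplies by 100.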
If you would like to create a file containing your source code, paste it into the cell below and run it. Please remember to add your name to the file.
In [9]:
%%writefile GC_calculator.py
from __future__ import division
flu_ns1_seq = 'GTGACAAAGACATAATGGATCCAAACACTGTGTCAAGCTTTCAGGTAGATTGCTTTCTTTGGCATGTCCGCAAACGAGTTGCAGACCAAGAACTAGGTGA'
# Write your code here (if you wish)
flu_ns1_seq_upper = flu_ns1_seq.upper()
# Count the number of "C"s in the above sequence
c_count = flu_ns1_seq_upper.count('C')
# Count the number of "G"s in the above sequence
g_count = flu_ns1_seq_upper.count('G')
# Add "C" and "G" counts together
g_c_count = c_count + g_count
# Count the total number of nucleotides in the sequence
sequence_length = len(flu_ns1_seq_upper)
# Divide the total number of "C" and "G" nucleotides by the total number of nucleotides,
# then multiply by 100 to express the result as a percentage
gc_percentage = g_c_count / sequence_length * 100
# Print the percentage
print(gc_percentage)
In [10]:
!python GC_calculator.py
Note: Later in the course we will look at the Biopython package, which includes the capability to compute GC percentage.